-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Performance] Investigate and decrease RAM usage - add pebble kvstore #52
Conversation
I've made some changes to the BadgerDB while still working on it, and then also added Pebble. I think we're going to stick with Pebble for now as it has a lower memory footprint. We could go back later, or even better—make the store and its options configurable. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kvstore/pebble/interface.go
Outdated
// --- Store methods --- | ||
Get(key []byte) ([]byte, error) | ||
Set(key, value []byte) error | ||
Delete(key []byte) error | ||
// --- Lifecycle methods --- | ||
Stop() error | ||
// --- Accessors --- | ||
GetAll(prefixKey []byte, descending bool) (keys, values [][]byte, err error) | ||
Exists(key []byte) (bool, error) | ||
// Len returns the number of key-value pairs in the store | ||
Len() (int, error) | ||
// --- Data management --- | ||
ClearAll() error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// --- Store methods --- | |
Get(key []byte) ([]byte, error) | |
Set(key, value []byte) error | |
Delete(key []byte) error | |
// --- Lifecycle methods --- | |
Stop() error | |
// --- Accessors --- | |
GetAll(prefixKey []byte, descending bool) (keys, values [][]byte, err error) | |
Exists(key []byte) (bool, error) | |
// Len returns the number of key-value pairs in the store | |
Len() (int, error) | |
// --- Data management --- | |
ClearAll() error | |
// General KVStore Methods - From the SM(S)T MapStore Interface | |
kvstore.MapStore |
This could be done with the badger
kvstore interface as well and seperate the methods not found in the mapstore interface (like the Backup
/Restore
methods).
If pebble
has any sort of methods like those I would add support for them here as well as this is a standalone package when imported enabling extra features outside of the use of the SM(S)T (even if being used with it) could bring a benefit (even a // TODO: Expand interface features
will be fine for now.
defer iter.Close() | ||
for iter.First(); iter.Valid(); iter.Next() { | ||
if err := store.db.Delete(iter.Key(), pebble.Sync); err != nil { | ||
return errors.Join(ErrPebbleClearingStore, err) | ||
} | ||
} | ||
if err := iter.Error(); err != nil { | ||
return errors.Join(ErrPebbleClearingStore, err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random thought, would it be faster to just zero the db file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't zero/rm the file(s) if the database is still open.
In RelayMiner, I proposed not to delete the keys anymore (making this function obsolete) because we remove the database itself in the next step.
So you're correct - we don't need to delete the keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to remember this being a separate repo means it shouldn't solely focus on Shanon so keep it in - for the interface alone mark it with a potential todo to see if theres a faster way to clear it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are ways to optimize. For example we can use Batch + DeleteRange. Will add a comment.
count := 0 | ||
iter, _ := store.db.NewIter(nil) | ||
defer iter.Close() | ||
for iter.First(); iter.Valid(); iter.Next() { | ||
count++ | ||
} | ||
if err := iter.Error(); err != nil { | ||
return 0, errors.Join(ErrPebbleGettingStoreLength, err) | ||
} | ||
return count, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth adding a len uint64
field to the pebbleKVStore
struct and like in a linked list keeping track of the length through the Set
and Delete
and ClearAll
methods etc so we don't have to error here or iterate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do that in the future if there are performance issues on relay miner. That also kind-of kills a point of the database being a source of truth.. I'd keep it as is.
I don't think we use Len()
anywhere though, I was tempted to delete it as we don't need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Len in used internally in the SMT IIRC otherwise it wouldn't be in the interface. I'd have to check where though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@h5law just checked - Len()
is only used in tests. I wouldn't spend time on optimizing this call then.
func (sm *simpleMap) Len() int { | ||
return len(sm.m) | ||
func (sm *simpleMap) Len() (int, error) { | ||
return len(sm.m), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To jump on my comment earlier, I think having length return an error feels strange and removing the iteration in pebble will help with this, and make it faster - in pebbles case constant time at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're querying the length from the database - which is the source of truth for the data - this is not different from any other query. I'd say we should treat it as an SQL query COUNT(*)
- and as that operation can fail for any reason, we should return an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@okdas Left a few comments but should be g2g after the changes.
Great job owning this! 🔥
Co-authored-by: Daniel Olshansky <[email protected]> Signed-off-by: Dima K. <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌
Summary
We experience a high memory usage on RelayMiners, and it appears the consumption is primarily driven by badger.
Human Summary
AI Summary
reviewpad:summary
Issue
pokt-network/poktroll#551
Type of change
Please mark the relevant option(s):
Testing
make test_all
make benchmark_{all | suite name}
Required Checklist
godoc
format comments see: tip.golang.org/doc/comment)If Applicable Checklist